A Different Way to Solve the Missing Value Problem: The Case of Equal Employment Opportunity Data
Author: Jing Zhou
Abstract
Title of Document: THE MISSING VALUE PROBLEM: A REVIEW AND CASE STUDY. Jing Zhou, M.A., 2006. Directed By: Professor Paul J. Smith, Statistics Program, Department of Mathematics.
The purpose of this thesis is to review methods of imputation and apply them to data collected by the Equal Employment Opportunity Commission (EEOC). First, I discuss several imputation methods and review the theory of multiple imputation (MI). Next, I review aspects of missing data and outline an artificial data simulation. I describe a simulation based on an EEOC dataset listing numbers of employees by ethnicity in large establishments. Mean imputation and MI are applied to the simulated datasets. In the first scenario, we impute data for nonresponding establishments; the more we impute, the higher the resulting population means. In the second scenario, we simulate item nonresponse. I find that mean imputation and MI generate similar means, and that the means are not affected by the percentage of missingness under either imputation method. The results suggest that MI produces larger standard errors than mean imputation. Lastly, the percentage of missingness has no effect on the standard error in the case of MI.
Similar resources
A method to solve the problem of missing data, outlier data and noisy data in order to improve the performance of human and information interaction
Abstract Purpose: Errors in data collection and failure to pay attention to data that are noisy in the collection process for any reason cause problems in data-based analysis and, as a result, wrong decision-making. Therefore, solving the problem of missing or noisy data before processing and analysis is of vital importance in analytical systems. The purpose of this paper is to provide a metho...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging topic and needs to be carefully considered before learning from or predicting time series. Considerable research has been done on the use of diffe...
Flow Shop Scheduling Problem with Missing Operations: Genetic Algorithm and Tabu Search
Flow shop scheduling problem with missing operations is studied in this paper. Missing operations assumption refers to the fact that at least one job does not visit one machine in the production process. A mixed-binary integer programming model has been presented for this problem to minimize the makespan. The genetic algorithm (GA) and tabu search (TS) are used to deal with the optimization...
Obtaining a possible allocation in the bankruptcy model using the Shapley value
Data envelopment analysis (DEA) is an effective tool for supporting decision-makers to assess bankruptcy, uncertainty concepts including intervals, and game theory. The bankruptcy problem with the qualitative parameters is an economic problem under uncertainty. Accordingly, we combine the concepts of the DEA game theory and uncertain models as interval linear programming (ILP), which can be app...
Exploring the Role of Social Capital in Agricultural Entrepreneurial Opportunity Recognition: Application of Smart PLS
Although agriculture has a major role in the economy in general, and in employment opportunities for the unemployed population in particular, the limited ability of individuals to recognize entrepreneurial opportunities in agriculture has reduced the share of agriculture in employment. Therefore, the enhancement of opportunity recognition skills among individuals in the agriculture sector is believed to solve...